74 research outputs found

Recommending with limited number of trusted users in social networks

Author: Guan Donghai
Khattak Asad Masood
Yuan Weiwei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/05/2018

© 2018 IEEE. To estimate the reliability of an unknown node in a social network, existing works involve as many opinions from other nodes as possible. Though this makes it possible to approximate the real property of the unknown node, the computational complexity increases as social networks grow larger. We therefore propose a novel method that involves only a limited number of social relations to predict the trustworthiness of unknown nodes. The proposed method comprises four rating prediction mechanisms: FM uses the recommendation given by the most reliable recommender with the shortest trust propagation distance from the active user as the predicted rating; FMW weights the recommendation in FM; FA uses the mean value of the recommendations with the shortest trust propagation distance from the active user as the predicted rating; and FAW weights the recommendations in FA. The simulation results show that the proposed method can greatly reduce the rating prediction computation, while the rating prediction losses are reasonable.
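
The FM, FA, and FAW mechanisms named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting formula or define how reliability is measured, so the `reliability` scores, the reliability-weighted mean used for FAW, and all function names below are assumptions made for illustration. FMW is omitted because its weighting is not described.

```python
def nearest_recommenders(trust, ratings, active_user):
    """Breadth-first search over trust links; return the users at the
    smallest hop distance from active_user who have rated the item."""
    seen = {active_user}
    frontier = [active_user]
    while frontier:
        next_frontier = []
        for user in frontier:
            for neighbor in trust.get(user, []):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        found = [u for u in next_frontier if u in ratings]
        if found:
            return found  # stop at the first distance that yields ratings
        frontier = next_frontier
    return []


def predict_fm(recommenders, ratings, reliability):
    """FM: rating of the most reliable nearest recommender
    (reliability scores are a hypothetical input)."""
    best = max(recommenders, key=lambda u: reliability[u])
    return ratings[best]


def predict_fa(recommenders, ratings):
    """FA: plain mean of the nearest recommenders' ratings."""
    return sum(ratings[u] for u in recommenders) / len(recommenders)


def predict_faw(recommenders, ratings, reliability):
    """FAW (assumed form): reliability-weighted mean of the ratings."""
    total = sum(reliability[u] for u in recommenders)
    return sum(reliability[u] * ratings[u] for u in recommenders) / total
```

Because the search stops at the first hop distance where any rating exists, only a small part of the trust network is visited, which is the source of the computational saving the abstract describes.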

ZU Scholars (Zayed University)

A Novel Hybrid Neural Network for Data Clustering

Author: Andrey Gavrilov
Donghai Guan
Publication date: 06/03/2020

Abstract. Clustering plays an indispensable role in data analysis. Many clustering algorithms have been developed. However, most of them suffer from either poor unsupervised learning performance or a lack of mechanisms for using prior knowledge about the data (semi-supervised learning) to improve the clustering result. In an effort to achieve semi-supervised clustering and better unsupervised clustering performance, we develop a hybrid neural network model (HNN). It is the sequential combination of a Multi-Layer Perceptron (MLP) and Adaptive Resonance Theory-2 (ART2). It inherits the two distinct advantages of stability and plasticity from ART2. Meanwhile, by incorporating the merits of MLP, it not only improves the performance of unsupervised clustering but also supports semi-supervised clustering if partial knowledge about the data is available. Experimental results show that our model can be used for both unsupervised and semi-supervised clustering with promising performance.

CiteSeerX

A recommender searching mechanism for trust-aware recommender systems in the Internet of Things

Author: Donghai Guan
Jianwei Niu
Lei Shu
Weiwei Yuan
Publication venue: 'KOREMA'
Publication date: 01/01/2013

Intelligent things are widely connected in the Internet of Things (IoT) to enable ubiquitous service access. This may cause heavy service redundancy. The trust-aware recommender system (TARS) is therefore proposed for IoT to help users find reliable services. One fundamental requirement of TARS is to efficiently find as many recommenders as possible for the active users. To achieve this, existing TARS approaches search the entire trust network, which has a very high computational cost. Although the trust network is a scale-free network, we show via experiments that TARS cannot find a satisfactory number of recommenders by directly applying the classical searching mechanism. In this paper, we propose an efficient searching mechanism, named S_Searching: based on the scale-freeness of trust networks, it chooses the globally highest-degree nodes to construct a Skeleton and searches for recommenders via this Skeleton. Benefiting from the superior outdegrees of the nodes in the Skeleton, S_Searching can find recommenders very efficiently. Experimental results show that S_Searching finds almost the same number of recommenders as a full search, which is far more than the classical searching mechanism finds in the scale-free network, while the computational complexity and cost are much lower.
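
The Skeleton idea can be sketched in a few lines of Python. This is a rough sketch, not the published S_Searching algorithm: the abstract does not specify how the search walks the Skeleton, so the rule used here (only the active user and Skeleton nodes are expanded, up to a hop limit), along with the names `build_skeleton`, `skeleton_search`, `k`, and `max_hops`, is assumed for illustration.

```python
def build_skeleton(out_edges, k):
    """Pick the k nodes with the largest out-degree as the Skeleton."""
    ranked = sorted(out_edges, key=lambda u: len(out_edges[u]), reverse=True)
    return set(ranked[:k])


def skeleton_search(out_edges, skeleton, active_user, max_hops):
    """Collect users reachable from active_user when only the active user
    and Skeleton nodes are allowed to forward the search (assumed rule)."""
    visited = {active_user}
    frontier = {active_user}
    found = set()
    for _ in range(max_hops):
        next_frontier = set()
        for user in frontier:
            # Non-Skeleton nodes are recorded as found but never expanded.
            if user != active_user and user not in skeleton:
                continue
            for neighbor in out_edges.get(user, ()):
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.add(neighbor)
        found |= next_frontier
        frontier = next_frontier
    return found
```

Expanding only high out-degree nodes keeps the number of visited edges small while still reaching most of the users those hubs point to, which matches the trade-off the abstract reports.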

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Classification with class noises through probabilistic sampling

Author: Guan Donghai
Khattak Asad Masood
Ma Tinghuai
Yuan Weiwei
Publication venue: 'Elsevier BV'
Publication date: 01/05/2018

© 2017. Accurately labeling training data plays a critical role in various supervised learning tasks. Because labeling in practical applications might be erroneous for various reasons, a wide range of algorithms has been developed to identify and remove mislabeled data. In essence, these algorithms adopt the strategy of one-zero sampling (OSAM), wherein a sample is selected and retained only if it is recognized as clean. There are two types of errors in OSAM: identifying a clean sample as mislabeled and discarding it, or identifying a mislabeled sample as clean and retaining it. These errors could lead to poor classification performance. To improve classification accuracy, this paper proposes a novel probabilistic sampling (PSAM) scheme. In PSAM, a cleaner sample has a higher chance of being selected. The degree of cleanliness is measured by the confidence in the label. To accurately estimate the confidence value, a probabilistic multiple voting idea is proposed, which is able to assign a high confidence value to a clean sample and a low confidence value to a mislabeled sample. Finally, we demonstrate that PSAM could effectively improve the classification accuracy over existing OSAM methods.
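
The contrast between OSAM and PSAM can be shown with a short Python sketch. The abstract does not give the sampling rule or the voting procedure, so this sketch assumes that each sample already has a confidence value in [0, 1] and that PSAM keeps a sample with probability equal to that value. The function names and the OSAM threshold are hypothetical.

```python
import random


def osam_select(confidences, threshold=0.5):
    """One-zero sampling: keep a sample only if it is judged clean
    (here: confidence at or above an assumed threshold)."""
    return [i for i, c in enumerate(confidences) if c >= threshold]


def psam_select(confidences, rng):
    """Probabilistic sampling (assumed rule): keep sample i with
    probability equal to its label confidence."""
    return [i for i, c in enumerate(confidences) if rng.random() < c]


if __name__ == "__main__":
    confidences = [1.0, 0.8, 0.45, 0.0]
    rng = random.Random(0)  # seeded so the run is repeatable
    print("OSAM keeps:", osam_select(confidences))
    print("PSAM keeps:", psam_select(confidences, rng))
```

Under this rule a borderline sample (confidence 0.45 here) is never kept by OSAM but is kept by PSAM a little under half the time, so a single wrong clean-or-noisy decision no longer removes it outright.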

ZU Scholars (Zayed University)

Cost-sensitive elimination of mislabeled training data

Author: Chow Francis
Guan Donghai
Khattak Asad Masood
Ma Tinghuai
Yuan Weiwei
Publication venue: 'Elsevier BV'
Publication date: 01/09/2017

© 2017 Elsevier Inc. Accurately labeling training data plays a critical role in various supervised learning tasks. Since labeling in practical applications might be erroneous for various reasons, a wide range of algorithms has been developed to eliminate mislabeled data. These algorithms may make two types of errors: identifying noise-free data as mislabeled, or identifying mislabeled data as noise-free. These errors may incur different costs, depending on the training datasets and applications. However, the cost variations are usually ignored, so existing works are not optimal with respect to cost. In this work, the novel problem of cost-sensitive mislabeled data filtering is studied. By wrapping a cost-minimizing procedure, we propose a prototype cost-sensitive, ensemble-learning-based mislabeled data filtering algorithm, named CSENF. Based on CSENF, we further propose two novel algorithms: the cost-sensitive repeated majority filtering algorithm CSRMF and the cost-sensitive repeated consensus filtering algorithm CSRCF. Compared to CSENF, these two algorithms can estimate the mislabeling probability of each training sample more confidently. Therefore, they incur lower cost than CSENF and cost-blind mislabeling filters. Empirical and theoretical evaluations on a set of benchmark datasets illustrate the superior performance of the proposed methods.
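
A cost-minimizing filtering decision of the kind this abstract describes can be sketched as follows. This is a generic expected-cost rule, not CSENF, CSRMF, or CSRCF themselves, whose details the abstract does not give. The sketch assumes the mislabeling probability is estimated as the fraction of ensemble members that disagree with the given label, and the two cost parameters are hypothetical inputs.

```python
def mislabel_probability(votes):
    """Fraction of ensemble members voting that the label is wrong.
    votes is a list of booleans, True meaning 'disagrees with the label'."""
    return sum(votes) / len(votes)


def should_remove(p_mislabeled, cost_remove_clean, cost_keep_noisy):
    """Remove the sample only if removing it has the lower expected cost."""
    expected_cost_remove = (1.0 - p_mislabeled) * cost_remove_clean
    expected_cost_keep = p_mislabeled * cost_keep_noisy
    return expected_cost_remove < expected_cost_keep
```

With equal costs this reduces to majority voting. When discarding a clean sample is expensive, for example with scarce training data, the same votes can lead to keeping the sample instead.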

ZU Scholars (Zayed University)

Improving Complex Network Controllability via Link Prediction

Author: Fahim Muhammad
Guan Donghai
Khattak Asad Masood
Wei Ran
Yuan Weiwei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

© 2019, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. A complex network is a network structure composed of a large number of nodes and complex relationships between these nodes. Complex networks can model many real-life systems: each individual in the system corresponds to a node in the network, and each relationship between individuals corresponds to an edge. The controllability of complex networks concerns how to drive the network from any initial state to a desired state using external input signals. The external input signals are transmitted to the whole network through some of its nodes, which are called driver nodes. Current studies of complex network controllability mainly address whether a network is controllable and how to select appropriate driver nodes. If a network has high controllability, it will be easy to control. However, complex networks are vulnerable, and attacks on them reduce controllability. Therefore, we propose in this paper a link prediction-based method to make the network more robust to different modes of attack. We have validated the effectiveness of the proposed method through experiments.

ZU Scholars (Zayed University)

Socialized healthcare service recommendation using deep learning

Author: Guan Donghai
Han Guangjie
Khattak Asad Masood
Li Chenliang
Yuan Weiwei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2018

© 2018, The Natural Computing Applications Forum. A socialized recommender system recommends reliable healthcare services to users. Ratings of the healthcare services are predicted by merging the recommendations given by users who have social relations with the active users. However, existing works do not consider the influence of distrust between users; they recommend items based only on the trust relations between users. We therefore propose a novel deep learning-based socialized healthcare service recommender model, which recommends healthcare services using recommendations given by recommenders with both trust and distrust relations with the active users. The influences of recommenders, considering both the node information and the structure information, are merged via the deep learning model. Experimental results show that the proposed model outperforms existing works on prediction accuracy and prediction coverage simultaneously, even for cold-start users or users with very sparse trust relations. It is also computationally less expensive.

ZU Scholars (Zayed University)

User behavior prediction via heterogeneous information preserving network embedding

Author: Guan Donghai
Han Guangjie
He Kangya
Khattak Asad Masood
Yuan Weiwei
Publication venue: 'Elsevier BV'
Publication date: 01/03/2019

© 2018 Elsevier B.V. User behavior prediction with low-dimensional vectors generated by user network embedding models has been verified to be efficient and reliable in real applications. However, most user network embedding models use homogeneous properties to represent users, such as attributes or user network structure. Though some works try to combine two kinds of properties, they still do not fully leverage the rich semantics of users. In this paper, we propose a novel heterogeneous information preserving user network embedding model, named HINE, for user behavior classification in user networks. HINE applies attributes, user network connections, user network structure, and user behavior label information for user representation in user network embedding. The embedded vectors, which consider these multi-type properties of users, contribute to better user behavior classification performance. Experiments verified the superior performance of the proposed approach on a real-world complex user network dataset.

ZU Scholars (Zayed University)

Crossref

Semi-supervised Time Series Anomaly Detection Model Based on LSTM Autoencoder

Author: Guan Donghai
Khattak Asad Masood
Tu Yaofeng
Xiao Hui
Yuan Weiwei
Zhao Rui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/06/2021

Time series data increasingly appears in various real-world systems, such as power plants and medical care. In these systems, time series anomaly detection is necessary for tasks such as predictive maintenance, intrusion detection, anti-fraud, and cloud platform monitoring and management. Generally, time series anomaly detection is regarded as an unsupervised learning problem. However, in a real scenario, in addition to a large set of unlabeled data, there is usually a small set of available labeled data, such as normal or abnormal data sets labeled by experts. Only a few methods use labeled data, and the existing semi-supervised algorithms are not yet suitable for time series anomaly detection. In this work, we propose a semi-supervised time series anomaly detection model based on an LSTM autoencoder. We modify the loss function of the LSTM autoencoder so that it is affected by unlabeled and labeled data at the same time, and the model learns the distributions of both by minimizing this loss function. In a large number of experiments on the Yahoo! Webscope S5 and NAB data sets, we compared the unsupervised and semi-supervised models built on the same network framework and found that the semi-supervised model performs better than the unsupervised model.

ZU Scholars (Zayed University)

Improved label noise identification by exploiting unlabeled data

Author: Chow Francis
Guan Donghai
Khattak Asad Masood
Wei Hongqiang
Yuan Weiwei
Zhu Qi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/02/2018

© 2017 IEEE. In machine learning, the available training samples are not always perfect, and some labels can be corrupted; such corrupted labels are called label noise. Label noise may reduce accuracy and also increase the complexity of the model. To mitigate the detrimental effect of label noise, noise filtering has been widely used; it tries to identify mislabeled samples and remove them prior to learning. Almost all existing works focus only on the mislabeled training dataset and ignore the existence of unlabeled data. In fact, unlabeled data are easily accessible in many applications. In this work, we explore how to use these unlabeled data to improve noise filtering. To this end, we propose a method named MFUDCM (Multiple Filtering with the aid of Unlabeled Data using Confidence Measurement). This method applies a novel multiple soft majority voting idea to make use of unlabeled data. In addition, MFUDCM is expected to identify mislabeled data more accurately by using the concept of multiple voting. Finally, the validity of MFUDCM is confirmed by experiments and by comparison with other methods.

ZU Scholars (Zayed University)